(57) A method and apparatus for identifying links in an
electronic document provides an electronic file as a data
structure having components and having base links that
define the structural relationship between the components,
traverses the data structure using the base links, and
produces a virtual link between two components by
recognizing a characteristic shared by the components. The
virtual link is identified when needed at run-time. A
function may be performed using the components as
components are identified.
Description

BACKGROUND OF THE INVENTION 

[0001] This invention identifies components in an 
electronic file. 

[0002] An electronic document typically has informa- 
tion content, such as text, graphics, and tables, and for- 
matting content that directs how to display the informa- 
tion content. Document publishing systems, which in- 
clude word processing systems and desktop publishing 
systems, may store electronic documents as hierarchi- 
cal data structures. Such structures represent the infor- 
mation content and formatting content as nodes con- 
nected to one another in an ordered arrangement. 
[0003] A system traverses a data structure to gather 
data about the structure and to perform operations using 
that data. To traverse a hierarchical structure, the sys- 
tem follows a set of links from one node to another. 
[0004] The links between the nodes are sometimes 
described in terms of family relationships. A node at- 
tached to and above another node in the hierarchical 
structure is referred to as the parent of the latter node. 
A node attached to and below another node in the hier- 
archical structure is referred to as the child of the latter 
node. Nodes having the same parent are referred to as 
siblings. 

[0005] In addition to specifying nodal relationships in 
terms of familial links, systems may identify the relation- 
ship between nodes in terms of next and previous links. 
Next and previous links ignore the familial relationships 
and deal with incremental positions of nodes within a 
document. 

[0006] Familial links, and next and previous links will 
be referred to as "base links." The base links connect 
every node in the structure and define the structure's 
hierarchy. A system uses the base links to traverse the 
structure and discover the structure's organization. The 
structure's organization determines the order of 
processing for certain types of operations. For example, 
a spell checker may use the base links to examine each 
word in an electronic document from the beginning to 
the end of the document. The structure's organization 
also determines which nodes share behavior character- 
istics with other nodes. For example, a node may define 
paragraph characteristics that are inherited and refined 
by descendent nodes. 
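
The base links described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any particular publishing system: the `Node` class, its field names, and the `thread` and `walk` helpers are hypothetical, and pre-order is assumed to be the document order.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Node:
    """One component of the document; field names are illustrative."""
    name: str
    parent: Optional["Node"] = None                        # familial link, upward
    children: List["Node"] = field(default_factory=list)   # familial links, downward
    next: Optional["Node"] = None                          # next link (document order)
    prev: Optional["Node"] = None                          # previous link

    def add_child(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def thread(root: Node) -> None:
    """Assign next/previous links in pre-order, assumed here to be document order."""
    last: Optional[Node] = None
    stack = [root]
    while stack:
        node = stack.pop()
        node.prev, node.next = last, None
        if last is not None:
            last.next = node
        last = node
        stack.extend(reversed(node.children))


def walk(start: Node) -> Iterator[Node]:
    """Follow next links from `start` to the end of the document."""
    node: Optional[Node] = start
    while node is not None:
        yield node
        node = node.next


doc = Node("document")
sec = doc.add_child(Node("section"))
sec.add_child(Node("para-1"))
sec.add_child(Node("para-2"))
doc.add_child(Node("para-3"))
thread(doc)
order = [n.name for n in walk(doc)]
print(order)
```

A routine such as a spell checker would iterate over `walk(doc)` and examine each component in turn, while inheritance of paragraph characteristics would follow the `parent` links instead.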

[0007] Other than a set of base links that connect all 
nodes in a hierarchical data structure, a system can 
have sets of direct links to connect nodes in the same 
or in different branches of a hierarchical data structure. 
Direct links locate nodes that may have an effect upon 
each other under a certain set of circumstances. For ex- 
ample, if an author inserted a numbered section heading 
into a document, the system could use one set of direct 
links between numbered section heading nodes to find 
and renumber all subsequent section headings. Direct 
links are also useful in other situations, for example to 



identify components of a detailed outline, identify com- 
ponents of a brief outline, locate all index markers, and 
locate all bibliographic references. 

SUMMARY OF THE INVENTION

[0008] In one aspect, the invention is directed to a
computer-implemented method for identifying links in an
electronic file that is expressed as a data structure
having components and base links. The base links define a
structural relationship between the components. The method
of the invention traverses the data structure using the
base links and produces a virtual link between components
in the data structure by recognizing a characteristic
shared by the components.

[0009] The virtual link is identified when needed at
run-time. A function, such as a renumbering function or a
function that generates text, may be performed using each
component that is virtually linked to another component.

[0010] A plurality of traversal routines can execute
sequentially to identify a virtual link between components.
The data structure can be hierarchical, and the traversal
path used by the traversal routines can be expressed in
terms of family, next, and previous structural
relationships.

[0011] Among the advantages of the invention are one or
more of the following. The invention requires only one set
of base links. Eliminating all other links between
components (e.g., direct links) eliminates the need to
regenerate those other links when the structure is altered.
Furthermore, memory requirements are reduced because
multiple sets of links are not stored.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] The foregoing features and other aspects of the
invention will become more apparent from the drawings taken
together with the accompanying description, in which:

[0013] FIG. 1 is a block diagram of a computer plat- 
form suitable for supporting virtual navigators in accord- 
ance with the invention. 

[0014] FIG. 2 is a diagram of a hierarchy of components in
an electronic document.

[0015] FIG. 3A is a diagram showing base links and 
virtual links. 

[0016] FIG. 3B is a diagram showing base links and 
virtual links. 

[0017] FIG. 4 is a flow chart of the context in which a
virtual navigator is used. 

[0018] FIG. 5 is an illustration of the cascading virtual 
navigators. 

[0019] FIG. 6 is a flow chart of the ordered-list virtual
navigator.
DETAILED DESCRIPTION

[0020] Referring now to FIG. 1, a computer platform 100
suitable for supporting an electronic document publishing
system 101 is shown. The electronic document publishing
system 101 includes one or more virtual navigators 102 on
disk or in main memory. The computer platform 100 includes
a digital computer 104, a display 106, a keyboard 108, a
mouse or other pointing device 110, and a mass storage
device 112 (e.g., hard disk drive, magneto-optical disk
drive, or floppy disk drive). The computer 104 includes
memory 120, a processor 122, and other customary
components, such as a memory bus and a peripheral bus (not
shown).

[0021] An electronic document 130 contains information
stored on a hard disk or other computer-readable medium
such as a diskette. A human-perceptible version of the
electronic document 132 is viewable on the computer display
106 or as a hardcopy printout obtained through operation on
the electronic document by a computer program.

[0022] Referring now to FIG. 2, a group of compo- 
nents 201-206 organized as a hierarchical data struc- 
ture 200 is shown. The data structure 200 represents 
an electronic document. The components may be sec- 
tion headings, paragraphs, list items, and so forth. For 
example, component 202 and component 205 may be 
paragraphs, components 203 and 206 may be foot- 
notes, and component 204 may be an index entry. 
[0023] The electronic document publishing system 101 uses
base links to identify the interrelationship of all of the
components in the hierarchical structure. Solid lines
250-256 between nodes 201-206 in FIG. 2 depict the
familial, next, and previous links of the data structure
200. The familial links and the next and previous component
links may be specified and stored as attribute/value pairs
with each component. For example, an attribute may be a
parent link or a child link and a value may be a pointer to
a parent node or child node.
[0024] Rather than storing and maintaining additional 
links, such as direct links, the system 101 uses virtual 
navigators 102 (FIG. 1) to locate specific components 
in the data structure. A virtual navigator is a software 
routine. As the name implies, a virtual navigator identi- 
fies an apparent path between components by travers- 
ing the data structure through the base links. 

[0025] Shown in FIG. 3A are apparent path 357 and apparent
path 358 between footnote component 203, index component
204, and footnote component 206. Footnotes 203 and 206 and
index component 204 share the characteristic that they are
anchored to another component, such as a paragraph, and are
each a type of anchor component. An anchor virtual
navigator produces the virtual link 357 between footnote
component 203 and index component 204 by using base link
255, and produces the virtual link 358 between index
component 204 and footnote component 206 by using base link
254, base link 252, and base link 256.



[0026] Shown in FIG. 3B is a virtual link 359. Virtual link
359 is derived from virtual link 357 and virtual link 358.
The footnote virtual navigator produced virtual link 359
using virtual link 357 and virtual link 358, which the
anchor virtual navigator produced.
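
The relationship between virtual links 357, 358, and 359 can be sketched as follows. The tree shape below is an assumption chosen to be consistent with the base links named in the text (the figures themselves are not reproduced here), and `Component`, `next_component`, `next_anchor`, and `next_footnote` are hypothetical names. No link between anchors is ever stored; each one is rediscovered from parent and child links on demand.

```python
from typing import List, Optional


class Component:
    def __init__(self, ident: int, kind: str) -> None:
        self.ident, self.kind = ident, kind
        self.parent: Optional["Component"] = None
        self.children: List["Component"] = []

    def add(self, child: "Component") -> "Component":
        child.parent = self
        self.children.append(child)
        return child


def next_component(node: Component) -> Optional[Component]:
    """Next component in document order, using only familial base links."""
    if node.children:
        return node.children[0]
    while node.parent is not None:
        siblings = node.parent.children
        i = siblings.index(node)
        if i + 1 < len(siblings):
            return siblings[i + 1]
        node = node.parent
    return None


ANCHOR_KINDS = {"footnote", "index"}  # assumed set of anchor component types


def next_anchor(node: Component) -> Optional[Component]:
    """Anchor navigator: skip components that are not anchors."""
    found = next_component(node)
    while found is not None and found.kind not in ANCHOR_KINDS:
        found = next_component(found)
    return found


def next_footnote(node: Component) -> Optional[Component]:
    """Footnote navigator: built on the anchor navigator's virtual links."""
    found = next_anchor(node)
    while found is not None and found.kind != "footnote":
        found = next_anchor(found)
    return found


# Assumed layout: paragraph 202 holds 203 and 204; paragraph 205 holds 206.
root = Component(201, "document")
p202 = root.add(Component(202, "paragraph"))
f203 = p202.add(Component(203, "footnote"))
i204 = p202.add(Component(204, "index"))
p205 = root.add(Component(205, "paragraph"))
f206 = p205.add(Component(206, "footnote"))

link_357 = (f203.ident, next_anchor(f203).ident)
link_358 = (i204.ident, next_anchor(i204).ident)
link_359 = (f203.ident, next_footnote(f203).ident)
```

Here `next_footnote` never touches the base links directly; it reaches footnote 206 from footnote 203 only by following the two anchor-to-anchor hops.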

[0027] The electronic document publishing system 101
provides a virtual navigator for each type of component
that needs to be identified. Examples of virtual navigators
102 include a footnote virtual navigator that locates all
footnotes, an ordered-list virtual navigator that locates
all ordered lists, a numbered paragraph virtual navigator
that locates all numbered paragraphs, and a paragraph
virtual navigator that locates all paragraphs.
[0028] In an object-oriented environment, a base virtual
navigator class is the class from which all other virtual
navigator classes are derived, and thus all other virtual
navigator classes inherit features from the base virtual
navigator class. Each type of virtual navigator 102 is
defined by its own class and each virtual navigator 102 is
an object instantiated from that class. All virtual
navigators 102 can inherit and use functions defined for
any ancestral virtual navigator classes.
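
One way to picture this class arrangement is the sketch below. It assumes, purely for illustration, that each navigator class differs from its ancestor only in the component type it recognizes; the class names, the `target` attribute, and the use of a Python list as a stand-in for the base next links are all hypothetical.

```python
from typing import Iterator, List, Optional, Type


class Component:
    def __init__(self, text: str) -> None:
        self.text = text


class Paragraph(Component):
    pass


class NumberedParagraph(Paragraph):
    pass


class Footnote(Component):
    pass


class VirtualNavigator:
    """Hypothetical base class; subclasses only change `target`."""
    target: Type[Component] = Component

    def get_next(self, doc: List[Component], index: int) -> Optional[int]:
        """Position of the next component of the target type after `index`."""
        for i in range(index + 1, len(doc)):
            if isinstance(doc[i], self.target):
                return i
        return None

    def each(self, doc: List[Component]) -> Iterator[Component]:
        """Inherited helper: yield matching components one at a time."""
        i = self.get_next(doc, -1)
        while i is not None:
            yield doc[i]
            i = self.get_next(doc, i)


class ParagraphNavigator(VirtualNavigator):
    target = Paragraph


class NumberedParagraphNavigator(ParagraphNavigator):
    target = NumberedParagraph


class FootnoteNavigator(VirtualNavigator):
    target = Footnote


doc = [Paragraph("intro"), Footnote("fn1"), NumberedParagraph("1."),
       Paragraph("body"), Footnote("fn2")]
paragraphs = [c.text for c in ParagraphNavigator().each(doc)]
numbered = [c.text for c in NumberedParagraphNavigator().each(doc)]
footnotes = [c.text for c in FootnoteNavigator().each(doc)]
```

Every navigator object reuses `get_next` and `each` from the base class, which mirrors the statement that navigators can use functions defined for any ancestral navigator class.
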
[0029] Each virtual navigator 102 uses the base links of
the hierarchical data structure or virtual links provided
by other virtual navigators and identifies a set of
components by recognizing common characteristics shared by
the set of components. The virtual navigators 102 need not
construct or store a data structure on a computer medium or
in a computer memory after identifying a set of components.
A chain of components is discovered dynamically and each
component is used for a specific function at the time the
component is discovered, before the virtual navigator
searches for another component in the chain.
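
A Python generator is one natural way to express this discover-then-use behavior. The sketch below is illustrative only: `Component`, `chain`, and `navigate` are hypothetical names, and the `visited` list exists solely to show how far the traversal has progressed when the caller stops.

```python
from typing import Iterator, List, Optional


class Component:
    def __init__(self, kind: str, text: str) -> None:
        self.kind, self.text = kind, text
        self.next: Optional["Component"] = None  # base "next" link


def chain(*components: Component) -> Component:
    """Connect components with next links and return the first one."""
    for a, b in zip(components, components[1:]):
        a.next = b
    return components[0]


visited: List[str] = []  # instrumentation only, not part of the technique


def navigate(start: Component, kind: str) -> Iterator[Component]:
    """Yield each matching component as soon as it is discovered."""
    node: Optional[Component] = start
    while node is not None:
        visited.append(node.text)
        if node.kind == kind:
            yield node  # the caller acts on it before the search resumes
        node = node.next


first = chain(Component("paragraph", "a"), Component("footnote", "b"),
              Component("paragraph", "c"), Component("footnote", "d"))

used: List[str] = []
for footnote in navigate(first, "footnote"):
    used.append(footnote.text)  # perform the function on this component
    break                       # the caller needs only the first match
```

Because the caller stops after the first footnote, only components "a" and "b" are ever visited, and no list of footnotes is built or stored.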

[0030] A virtual navigator may be used when an author adds,
deletes, moves, or otherwise modifies one or more
components in the data structure 200. If the modification
affects the way in which other components are numbered, a
renumbering routine may be called to renumber affected
paragraphs. That routine may use a numbered paragraph
virtual navigator, a footnote virtual navigator, or both,
to identify components that need renumbering.

[0031] As an example, a virtual navigator may be called
when a new section heading is inserted between existing
section headings in an electronic document. Thus, if a new
section heading is inserted between Section 2.0 and Section
3.0, the virtual navigator identifies all numbered section
headings from Section 3.0 through the end of the electronic
document. When a section heading is identified, a routine,
such as the routine that called the virtual navigator,
renumbers the heading. Section 3.0 will become 4.0, Section
3.1 will become 4.1, and so on.
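
The renumbering example can be worked through in code. This is a sketch under simplifying assumptions: the document is modeled as a flat Python list standing in for the base next links, section numbers are stored as `[major, minor]` pairs, and `Component`, `headings_after`, `insert_section`, and `label` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Component:
    kind: str                            # "heading" or "paragraph" (illustrative)
    text: str
    number: Optional[List[int]] = None   # [3, 1] stands for Section 3.1


def headings_after(doc: List[Component], index: int) -> Iterator[Component]:
    """Yield each numbered heading after position `index`, one at a time."""
    for i in range(index + 1, len(doc)):
        if doc[i].kind == "heading" and doc[i].number is not None:
            yield doc[i]


def insert_section(doc: List[Component], index: int, text: str, major: int) -> None:
    """Insert a new top-level heading and renumber every later heading."""
    doc.insert(index, Component("heading", text, [major, 0]))
    for heading in headings_after(doc, index):
        heading.number[0] += 1  # renumber as soon as the heading is identified


def label(component: Component) -> str:
    return ".".join(str(n) for n in component.number)


doc = [
    Component("heading", "Intro", [1, 0]),
    Component("paragraph", "..."),
    Component("heading", "Methods", [2, 0]),
    Component("paragraph", "..."),
    Component("heading", "Results", [3, 0]),
    Component("heading", "Details", [3, 1]),
    Component("paragraph", "..."),
    Component("heading", "Summary", [4, 0]),
]
insert_section(doc, 4, "Data", 3)  # new heading between 2.0 and the old 3.0
labels = [label(c) for c in doc if c.kind == "heading"]
```

After the insertion the old Section 3.0 carries the label 4.0 and the old Section 3.1 carries 4.1, matching the example in the text.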

[0032] The virtual navigators 102 use protocols based on
traversal methods that obtain the parent, next child,
previous child, first child, last child, and next and
previous components. Each virtual navigator 102 implements
at least one traversal routine tailored to a specific type
of component and considers the linkage requirements for the
component type. For example, a numbered paragraph virtual
navigator has three traversal routines, "GetParent",
"GetNext", and "GetPrev", that recognize a numbered
paragraph component. A paragraph virtual navigator has
traversal routines "GetParent", "GetNext", "GetPrev",
"GetNextChild", "GetPrevChild", "GetFirstChild", and
"GetLastChild" that recognize paragraph components.
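
The three routines of the numbered paragraph virtual navigator might look like the sketch below. The method names `get_parent`, `get_next`, and `get_prev` are Python-style stand-ins for "GetParent", "GetNext", and "GetPrev"; the `Component` class, the `numbered` flag, and the shared `_scan` helper are assumptions made for illustration.

```python
from typing import Callable, List, Optional


class Component:
    def __init__(self, name: str, numbered: bool = False,
                 parent: Optional["Component"] = None) -> None:
        self.name, self.numbered, self.parent = name, numbered, parent
        self.next: Optional["Component"] = None
        self.prev: Optional["Component"] = None


def link(components: List[Component]) -> None:
    """Set next/previous base links in the given document order."""
    for a, b in zip(components, components[1:]):
        a.next, b.prev = b, a


Step = Callable[[Component], Optional[Component]]


class NumberedParagraphNavigator:
    @staticmethod
    def _scan(node: Component, step: Step) -> Optional[Component]:
        """Follow one kind of base link until a numbered paragraph is found."""
        found = step(node)
        while found is not None and not found.numbered:
            found = step(found)
        return found

    def get_next(self, node: Component) -> Optional[Component]:
        return self._scan(node, lambda n: n.next)

    def get_prev(self, node: Component) -> Optional[Component]:
        return self._scan(node, lambda n: n.prev)

    def get_parent(self, node: Component) -> Optional[Component]:
        return self._scan(node, lambda n: n.parent)


root = Component("root")
a = Component("1", numbered=True, parent=root)
p = Component("plain", parent=a)
b = Component("1.1", numbered=True, parent=p)
c = Component("2", numbered=True, parent=root)
link([root, a, p, b, c])

nav = NumberedParagraphNavigator()
```

For instance, `nav.get_parent(b)` skips the unnumbered component `p` and returns `a`, so the navigator presents numbered paragraphs as if they were linked directly to one another.
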
[0033] FIG. 4 illustrates the use of a virtual navigator. 
First, the electronic document is stored as a hierarchical 
data structure 200 having base structural links (step 
410). When the electronic document publishing system 
101 needs to perform a task on particular components, 
a virtual navigator is called to identify the components. 
A link between identified components is not stored, so 
the virtual navigator produces a virtual link between 
components as the components are identified (step 
420). The virtual navigator derives the virtual link by call- 
ing other virtual navigators. The virtual navigators use 
the base links, which may simply be pointers from one 
component to another, of the hierarchical data structure 
to identify the particular set of components. 
[0034] To derive a virtual link, a virtual navigator iden- 
tifies a component having a specific characteristic (step 
460), which will be discussed. The routine that called 
the virtual navigator may perform an operation using the 
identified component (step 470). After the operation is 
performed, the virtual navigator may be called again to 
search for another component having the specified 
characteristic. The cycle of calling the virtual navigator 
and performing a function is repeated until the calling 
routine determines that all components were identified. 
For example, the calling routine may need the entire hi- 
erarchical data structure traversed or only need to iden- 
tify components in a specific branch. 
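
The cycle of steps 460 and 470 can be sketched as a loop owned by the calling routine. Everything named here (`Component`, `next_component`, `get_next`, `in_branch`, `for_each`) is hypothetical; the point is only that the caller, not the navigator, decides when to stop, for example at the end of one branch.

```python
from typing import Callable, List, Optional


class Component:
    def __init__(self, name: str, kind: str) -> None:
        self.name, self.kind = name, kind
        self.parent: Optional["Component"] = None
        self.children: List["Component"] = []

    def add(self, child: "Component") -> "Component":
        child.parent = self
        self.children.append(child)
        return child


def next_component(node: Component) -> Optional[Component]:
    """Next component in document order, using only familial base links."""
    if node.children:
        return node.children[0]
    while node.parent is not None:
        siblings = node.parent.children
        i = siblings.index(node)
        if i + 1 < len(siblings):
            return siblings[i + 1]
        node = node.parent
    return None


def get_next(node: Component, kind: str) -> Optional[Component]:
    """Navigator call (step 460): find the next component of `kind`."""
    found = next_component(node)
    while found is not None and found.kind != kind:
        found = next_component(found)
    return found


def in_branch(node: Optional[Component], branch: Component) -> bool:
    while node is not None:
        if node is branch:
            return True
        node = node.parent
    return False


def for_each(branch: Component, kind: str,
             function: Callable[[Component], None]) -> None:
    """Calling routine: alternate navigator calls with the operation."""
    node = get_next(branch, kind)
    while node is not None and in_branch(node, branch):
        function(node)               # step 470: operate on the component
        node = get_next(node, kind)  # step 460 again


root = Component("doc", "document")
s1 = root.add(Component("s1", "section"))
s1.add(Component("p1", "paragraph"))
s1.add(Component("f1", "footnote"))
s1.add(Component("f2", "footnote"))
s2 = root.add(Component("s2", "section"))
s2.add(Component("f3", "footnote"))

in_s1: List[str] = []
for_each(s1, "footnote", lambda c: in_s1.append(c.name))
in_doc: List[str] = []
for_each(root, "footnote", lambda c: in_doc.append(c.name))
```

Restricting the loop to branch `s1` stops the cycle as soon as the navigator returns a footnote outside that section, while passing the root traverses the whole structure.
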
[0035] Due to similar linkage requirements, virtual 
navigators 102 call other virtual navigators that identify 
other types of components. Together, the virtual naviga- 
tors can traverse the entire hierarchical data structure 
via the base links. For example, an ordered-list compo- 
nent requires a numbered paragraph component to be 
its parent component, and a numbered paragraph com- 
ponent requires a paragraph component to be its parent. 
In this case, an ordered-list virtual navigator calls a num- 
bered paragraph virtual navigator, and the numbered 
paragraph virtual navigator calls the paragraph virtual 
navigator. 

[0036] Shown in FIG. 5 is a conceptual representation of
three virtual navigators interacting with one another to
identify ordered-list components using GetNext traversal
routines. An ordered-list class is derived from a numbered
paragraph class, and a numbered paragraph class is derived
from a paragraph class. The ordered-list virtual navigator
obtains the next ordered-list component (step 460') by
sequentially obtaining the next numbered paragraph until an
ordered-list component is found (step 520). To obtain the
next numbered paragraph, the numbered paragraph virtual
navigator sequentially gets the next paragraph until a
numbered paragraph is found (step 530). This cascading
effect can continue up to the virtual navigator that
identifies a component in the class from which all
component classes are derived.

[0037] FIG. 6 is an illustrative example of the
ordered-list virtual navigator's GetNext traversal routine
460' identifying components that are ordered lists. The
ordered-list virtual navigator's GetNext routine 460'
begins by getting the next numbered paragraph in the
structure: it calls the numbered paragraph virtual
navigator's GetNext routine 520 (step 521). The
ordered-list virtual navigator tests whether a numbered
paragraph was returned (step 522) and whether the numbered
paragraph is an ordered-list component (step 524). If an
ordered-list component was returned, the ordered-list
virtual navigator returns (step 526) and the calling
routine can perform a prescribed function using the
ordered-list component. For example, the function may
increment a section number. If a numbered paragraph was
returned, but it was not an ordered-list component, the
ordered-list virtual navigator continues to search for an
ordered-list component. If a numbered paragraph was not
returned, the entire structure was traversed and the
ordered-list virtual navigator returns to the calling
routine (step 526).

[0038] Getting the next numbered paragraph follows 
a similar technique. To obtain the next numbered para- 
graph, the numbered paragraph virtual navigator's Get- 
Next traversal routine 520 calls the paragraph virtual 
navigator's GetNext traversal routine 530 (step 531), 
tests whether a paragraph was returned (step 532), and 
if a paragraph was returned, tests whether the para- 
graph is a numbered paragraph (step 534). If the para- 
graph was not a numbered paragraph, the numbered 
paragraph virtual navigator's GetNext routine 520 re- 
peats steps 531-534 until a numbered paragraph is re- 
turned or the numbered paragraph virtual navigator has 
traversed the structure. 

[0039] To get the next paragraph, the paragraph virtual
navigator obtains the next component because a paragraph is
derived from a component. The paragraph virtual navigator's
GetNext routine 530 is called to obtain the next component.
The paragraph virtual navigator's GetNext routine calls the
component virtual navigator's GetNext traversal routine
(step 541) and tests whether a component was returned (step
542), and if so, whether the component is a paragraph (step
544). If the component was not a paragraph, the paragraph
virtual navigator's GetNext routine repeats steps 541-544
until a paragraph is returned or the paragraph virtual
navigator has traversed the structure.
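
The cascade of GetNext routines described for FIG. 6 can be sketched as three functions that each call the one beneath them. The step numbers in the comments follow the text; the component classes, the function names, and the `Footnote` class used as a non-paragraph filler are assumptions made for illustration.

```python
from typing import List, Optional


class Component:
    def __init__(self, label: str) -> None:
        self.label = label
        self.next: Optional["Component"] = None  # base "next" link


class Paragraph(Component):
    pass


class NumberedParagraph(Paragraph):
    pass


class OrderedList(NumberedParagraph):
    pass


class Footnote(Component):
    pass


def component_get_next(c: Component) -> Optional[Component]:
    return c.next


def paragraph_get_next(c: Component) -> Optional[Component]:
    """Routine 530: repeat steps 541-544 until a paragraph or the end."""
    while True:
        found = component_get_next(c)                          # step 541
        if found is None or isinstance(found, Paragraph):      # steps 542, 544
            return found
        c = found


def numbered_get_next(c: Component) -> Optional[Component]:
    """Routine 520: repeat steps 531-534 until a numbered paragraph or the end."""
    while True:
        found = paragraph_get_next(c)                              # step 531
        if found is None or isinstance(found, NumberedParagraph):  # steps 532, 534
            return found
        c = found


def ordered_list_get_next(c: Component) -> Optional[Component]:
    """Routine 460': repeat steps 521-524 until an ordered list or the end."""
    while True:
        found = numbered_get_next(c)                           # step 521
        if found is None or isinstance(found, OrderedList):    # steps 522, 524
            return found                                       # step 526
        c = found


items = [Paragraph("p0"), Footnote("f"), NumberedParagraph("n1"),
         Paragraph("p1"), OrderedList("o1"), NumberedParagraph("n2"),
         OrderedList("o2")]
for a, b in zip(items, items[1:]):
    a.next = b

found_lists: List[str] = []
node = ordered_list_get_next(items[0])
while node is not None:
    found_lists.append(node.label)   # the caller's function would run here
    node = ordered_list_get_next(node)
```

Each routine only filters what the routine below it returns, so the ordered-list routine never inspects footnotes or plain paragraphs itself.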

[0040] Other embodiments are within the scope of the
following claims. Rather than one virtual navigator calling
another virtual navigator, a virtual navigator can include
the functionality of several virtual navigators. Additional
object classes (e.g., containers), traversal functions, and
navigators may be implemented. Virtual navigators can
produce virtual links for linked data structures other than
hierarchical data structures. Other functions may be
performed after a component is identified, including
generating bibliographies, endnotes, tables of contents,
and indices.



Claims 

1. A computer-implemented method for identifying 
links in an electronic file that is expressed as a data 
structure having a plurality of components and base 
links that define a structural relationship between 
the components, the method comprising: 

traversing a data structure using a plurality of 
base links; and 

producing a virtual link between a first component and a
second component in the data structure by recognizing a
characteristic shared by the first component and the second
component.

2. The method of claim 1, wherein the virtual link is
identified when needed at run-time. 

3. The method of claim 1, further comprising:

performing a function using the second com- 
ponent before the traversal of the data structure is 
completed. 

4. The method of claim 1, further comprising:

providing a plurality of traversal routines that 
sequentially execute to identify a virtual link be- 
tween components. 

5. The method of claim 1, wherein the second component
inherits features from a component class, and
a traversal routine recognizes the second compo- 
nent by recognizing members of the component 
class until the second component is found. 

6. The method of claim 5, wherein the data structure 
is a hierarchical data structure and the traversal rou- 
tine specifies a traversal path in terms of family, 
next, and previous structural relationships. 

7. The method of claim 3, wherein the electronic file is 
an electronic document. 

8. The method of claim 7, wherein the function per- 
formed on the second component is a renumbering 
function. 

9. The method of claim 7, wherein the function performed
on the second component generates text.

10. The method of claim 7, wherein the function per- 
formed on the second component locates a text 
string. 

11. The method of claim 7, wherein the traversal routine
identifies a plurality of virtual links between compo- 
nents. 

12. The method of claim 11, wherein the data structure
is a hierarchical data structure, the virtual links rep- 
resent a hierarchical subset of components in the 
hierarchical data structure, the traversal routine 
specifies a traversal path in terms of family, next, or 
previous structural relationships, and the traversal 
routine specifies components according to data 
type. 



13. A computer-implemented method for identifying
links in an electronic file at run time, comprising: 

providing an electronic file as a hierarchical da- 
ta structure having a plurality of components 
and a plurality of base links that define a struc- 
tural relationship between the components; 
traversing the hierarchical data structure using 
a plurality of traversal routines that use the 
base links; 

defining the traversal routines as classes that 
inherit features from other traversal routine 
classes; 

having each traversal routine identify a plurality 
of links between a plurality of components in 
the hierarchical data structure by recognizing a 
characteristic shared by the components; and 
performing a function using each identified 
component at the time the component is iden- 
tified. 
14. A computer program operating on an electronic file
arranged as a data structure having a plurality of 
components and a plurality of base links that define 
a structural relationship between the components, 
the computer program residing on a computer- 
readable medium, comprising instructions causing 
a computer to: 

provide at least one traversal routine, with the 
traversal routine identifying a link between a first 
component and a second component in a data 
structure by traversing the data structure using the 
base links. 

15. The computer program of claim 14, wherein the sec- 
ond component inherits characteristics from a class 
of components and the traversal routine identifies 
the link by recognizing members of the class of 
components. 
16. The computer program of claim 14, further comprising
the instruction causing a computer to:

perform a function using the second compo- 
nent at the time the component is identified. 


17. The computer program of claim 14, wherein the 
electronic file is an electronic document.

18. The computer program of claim 17, further comprising
the instruction causing a computer to:

produce a plurality of links between a plurality 
of components in the data structure. 

19. The computer program of claim 17, further comprising
the instruction causing a computer to:

perform a function using the second compo- 
nent and subsequently linked components before 
traversal of the data structure is completed. 

20. The computer program of claim 19, wherein the
function performed is a renumbering function. 

21. The computer program of claim 19, wherein the 
function performed generates text. 